Maximum-entropy Monte Carlo method for the inversion of the structure factor in 

simple classical systems* 
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We present a method for the evaluation of the interaction potential of an equilibrium classical 
system starting from the (partial) knowledge of its structure factor. The procedure is divided into 
two phases both of which are based on the maximum entropy principle of information theory. First 
we determine the maximum entropy estimate of the radial distribution function constrained by 
the information contained in the structure factor. Next we invert the pair function and extract 
the interaction potential. The method is tested on a Lennard- Jones fluid at high density and the 
reliability of its results with respect to the missing information in the structure factor data are 
discussed. Finally, it is applied to the experimental data of liquid sodium at 100° C. 
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I. INTRODUCTION 

The radial distribution function (RDF) of an equilib- 
rium statistical system contains useful information con- 
cerning its physical properties. Indeed, at least for 
systems governed by pairwisc additive interactions, its 
knowledge allows one to compute the ensemble average 
for observable quantities such as internal energy and pres- 
sure. Furthermore, if the same hypotheses are satisfied, 
the RDF is in one-to-one correspondence with the mi- 
croscopic interaction potential [1, 2] and represents the 
starting point for the solution of the so-called "inverse 
problem" of statistical mechanics [3-8] . 

Despite its central role in the analysis of a statisti- 
cal system the RDF is not directly accessible from the 
experiments and its estimation passes through the mea- 
surement of the structure factor. This former quantity 
is formally related to the RDF by an inverse Fourier 
transform, which for a homogeneous and isotropic sys- 
tem reads: 
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where g(r) and S(k) are the RDF and the structure fac- 
tor, respectively, and p is the density. So the measure- 
ment of the RDF is reduced to the evaluation of the inte- 
gral appearing in Eq. (1). Unfortunately this procedure, 
although conceptually correct, cannot be directly applied 
due to some typical limitations in the measurement of the 
S(k). Indeed, the experimental information is obtained 
by an analysis of the x-ray and/or neutron diffraction 
data. These techniques provide results over a finite k 
range and a number of nontrivial corrections on measured 
data are needed. So the resulting experimental structure 
factor turns out to be incomplete and typically spoiled 
by systematic and statistical errors. As a consequence, 
the RDF obtained by means of Eq.l (1) may present non 
physical features and spurious structures could emerge. 



* http://link.aps.org/doi/10.1103/PhysRevE.84.041130 
t e-mail address: marco.dalessandro@isc.cnr.it 



In order to overcome these difficulties different ap- 
proaches have been pursued. A promising class of solu- 
tions is provided by simulation assisted methods in which 
a molecular dynamics or a Monte Carlo (MC) simulation 
is driven with the aim to minimize the differences be- 
tween the simulated structure factor and the experimen- 
tal data. Among the results belonging to this class we 
cite the reverse Monte Carlo technique, proposed by Mc- 
Greevy and Pusztai in [9], which implements the transi- 
tion probability on the basis of the \ 2 function between 
the reference and the simulated structure factors; this 
procedure, however, does allow one to determine the pair 
interaction potential. Further approaches are provided 
by the method proposed by Toth [10] and based on the 
previous work of Lyubartsev and Laaksonen [5] , and by 
the solution due to Almarza, Lomba, and Molina [11]. 
These methods consist in an iterative procedure for the 
evaluation of an effective pair potential compatible with 
the experimental data, so they attempt to provide a true 
solution of the inverse problem starting from the struc- 
ture factor. A comprehensive review of the simulation 
assisted methods is given in [12]. 

Since we are dealing with the reconstruction of the 
RDF starting from the partial knowledge of the experi- 
mental S(k) one question concerning the uniqueness of 
the solution naturally arises; at the same time it is de- 
sirable that no information besides that contained in the 
structure factor is transferred to the RDF during the re- 
construction. Both of these issues can be addressed using 
the maximum entropy principle (ME) [13] as the guide- 
line for the definition of the reconstruction procedure. 
Indeed ME has the remarkable feature of producing the 
highest entropy solution compatible with the given con- 
straints, so the corresponding estimate for the RDF is 
"maximally noncommittal with regard to the missing in- 
formation" [13]. ME-based algorithms for the inversion 
of the structure factor were first developed by Root, Egel- 
staff and Nickel [14] and by Soper [15]. In these papers 
it has been shown that the adoption of ME improves the 
Fourier transform of the structure factor data and re- 
duces the spurious structure in the RDF. ME has been 
introduced for the first time in contest of the inverse prob- 
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lem by Cilloco in [16]; the method described in this paper 
used ME inside a Monte Carlo simulation scheme. It has 
been shown that a maximum entropy ensemble of config- 
urations compatible with a given reference RDF can be 
built adopting a suitable definition of the transition prob- 
ability between neighbor states. This approach has been 
recovered and extended in [8]; the transition probability 
has been reinterpreted as an information-based feedback 
controller and an "integral term" has been added. The 
authors evidenced that this quantity converges to the in- 
teraction potential thus providing a ME-based solution 
of the inverse problem. 

The purpose of this paper is to present a ME Monte 
Carlo method for the inversion of the experimental struc- 
ture factor. The procedure is mainly based on the statis- 
tical properties of the pair distribution functions both in 
the r and in the k space. We will show that, thanks to 
the above mentioned features of ME, the algorithm pro- 
vides a reliable reconstruction of the RDF starting from 
a limited knowledge of the experimental structure factor. 
Once the S(k) has been inverted, we can apply the tech- 
nique described in [8] to the resulting RDF and extract 
the (pair) interaction potential. 

The paper is organized as follows. Section II contains 
a detailed theoretical description of our procedure. In 
Sec. Ill we test the method for a Lennard- Jones fluid as- 
suming different cutting points of the input S(k) and we 
invert the experimental data of liquid sodium at 100°C. 
Finally, in Sec. IV we discuss our results and present 
some concluding remarks. 



II. THEORY 

We present a statistical description of a simple 
monoatomic fluid, in an analogous way of what has been 
done in [8] , and extend this analysis to the Fourier trans- 
form of the RDF. Then we describe a procedure for the 
construction of a ME ensemble of configurations con- 
strained by the (partial) knowledge of the structure fac- 
tor. 



A. Preliminaries 

We define the notion of a model system, which is a ho- 
mogenous and isotropic collection of pointlike elements 
with average density p. Given an arbitrary configuration 
x of the model we define two quantities that will be rel- 
evant for the subsequent analysis: the local sampling of 
the elements pair function (PF) n and its Fourier trans- 
form h. The former quantity provides the number of 
particles rii inside the ith spherical shell of width Sr cen- 
tered on a reference element; the sampling is performed 
up to the maximum value tm and consequently the in- 
dex i runs from 1 to N = tmISt. The Fourier transform 



(FT) of the local PF is defined through the equation 

%=^>) = E^^n i; j = l,-,N (2) 
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where is the value of the radius associated to the zth 
shell: ri — iSr. Given the local pair function n, its FT 
n represents an N elements vectors in the k space. The 
component nj contains the fc-space value at kj = jSk; 
the sampling is performed with a uniform step of width 
Sk, chosen according to the relation 

SrSk = £ (3) 

We also introduce the notion of inverse Fourier transform 
(IFT). Given a /c-space vector n we define its IFT through 
the equation 

"i=^- 1, (») = ^E^^^»i (4) 

j=l 3 1 

Equation (3) ensures the orthonormality of the discrete 
basis of functions adopted in Eqs. (2) and (4), namely, 

A N 

sm(%i) sin(fcj-r;) = —5 U (5) 
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so the transformation of a PF from the r space to the k 
space and back again will reproduce the initial function 
[17]. It is worth mentioning that, according to Eq. (3), 
a sampling of width Sr in the r space produces a k space 
vector with a maximum wave number given by ku — 
Ti/Sr, in agreement with the Shannon-Nyquist sampling 
theorem [18]. 

Since we are interested in the construction of the av- 
erage pair functions (both in the r and k space) we have 
to extend the notion of local PF and of its FT to a large 
number of configurations. So we introduce a probabil- 
ity function p(x) defined upon the model configuration 
space and we collect an ensemble of s elements extracted 
according to p. The global samplings over the ensemble 
are defined as the average values of the local ones: 
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where the index a labels the elements of the ensemble 
and the last equality holds due to the linearity of the 
FT. The radial distribution function and the structure 
factor of the model system are defined starting from the 
global quantities (6) in the limit s — >• oo. The RDF is 
obtained by normalizing the global PF built on the model 
ensemble with the pair function of a uniform reference 
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system (perfect gas) with the same density of the model 
one: 
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where m- P9> = AnprfSr is the perfect gas pair function. 
The structure factor is defined in terms of the FT of the 
global pair function via the relation 



S(k j ) = l + m j 



(8) 



This definition provides the usual notion of S(k) for an 
isotropic statistical system, indeed making use of Eqs. 
(2) and (7) and performing the continuum limit gives 
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which is the formal definition of structure factor adopted 
in statistical mechanics. We observe that, due to the 
finite size of the model system, the integral in Eq. (9) 
extends up to the maximum sampled value of the model 
RDF. Consequently, according to the Shannon-Nyquist 
sampling theorem, the maximum allowed k resolution is 
given by ir/r M . 

We conclude this preliminary section by introducing a 
useful notation for dealing with the Fourier transforms. 
Since the FT and its inverse are realized as linear com- 
binations among the n and the n variables, respectively, 
we can write 
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where according to Eqs. (2), (3) and (4) both and its 
inverse are symmetric matrices given by 
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The sum in Eqs. (10) has been restricted to the first N— 1 
elements since the last one gives a zero contribution for 
algebraic reason related to the definition of the matrices 
(11). We point out that Eq. (5) ensures that the matrix 
product of these quantities gives the identity matrix as 
expected. 



B. Analysis of the model distribution function 

We are interested in computing the probability distri- 
bution of the FT of the global pair function built on the 
model ensemble. So we suppose that an ensemble of con- 
figurations has been extracted according to a given prob- 
ability distribution p(x) and we compute the probability 
associated to a particular sampling m as a function of 
the parameters of the underlying ensemble distribution. 

To achieve this task we start from the probability of 
the local sampling of the pair function n. The ith shell 



of the PF follows a Poisson distribution with expectation 
value fj,i [8]; we assume that the system exhibits a hard 
core (HC) structure with radius r so that the expected 
number of particles \ii is zero for i lower than the thresh- 
old value N a = r /Sr and is strictly positive otherwise. 
Since there is no correlation among different shells the 
complete distribution is obtained as the product of the 
single shell values and reads 
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The HC structure of the reference distribution imposes 
that is zero if there is some rii > for i < N . The 
FT of the local sampling of the pair function n is defined 
through a linear combination of the n variables (10), so 
its expectation value and its covariance can be expressed 
in terms of the /z; parameters: 

E (%) = P-3 = E c ^* 

i 

Cov (nj,n k ) = £jk ^^CijdkfH (13) 

i 

We observe that, due to the linear combination (10), the 
covariance matrix of the n variables is not diagonal even if 
the original variables n are uncorrclated. The variable 771 
is defined as the average of the n^ a ' (6), so it has the same 
expectation value p, and a covariance given by £/s. Its 
asymptotic distribution can be computed using a multi- 
variate central limit theorem; this theorem states that the 
distribution function of the reduced variable ^fs(fhj—Jij) 
converges, in the limit s — > 00, to a multivariate Gaus- 
sian with zero mean and covariance given by £. So we 
have 



y/s(m — fi) <~ J\f 



where: 



K (x) = 



(2^) N/2 |CI 



r («r 



1/2 



(14) 



(15) 



is the multivariate distribution function with zero mean 
and |£| represents the determinant of the covariance ma- 
trix. It turns out that the elements of 771 are linear depen- 
dent and consequently the covariance matrix is singular. 
This is due to the fact that only the nonzero components 
of the local PF contribute to the linear combination (10), 
so the number of independent elements of 771 is N — N„ . 
In order to avoid a singular covariance matrix we have to 
restrict our analysis to a set of independent elements of 
77i ; in this domain the covariance matrix can be inverted 
and its inverse reads: 
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(16) 



where the indices j, fc run from 1 to N — N„ and the tilde 
indicates that the matrices are restricted to the subset of 
independent variables. 
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This analysis shows that the asymptotic distribution 
for the independent subset of the rh components is de- 
scribed by a multivariate Gaussian distribution A/p with 
mean p, and inverse covariance s ■ We observe that 
both the expectation value (13) and the inverse covari- 
ance matrix (16) are functions of the parameters of the 
original distribution function (12). 



C. Maximum entropy approach to the inverse 
problem 

We consider a monoatomic system and assume that for 
a given density p and temperature T the structure factor 
S t (k) of the system is known up to the maximum value 
k M . We will refer to this system as the target. 

The aim of this section is to define a procedure for the 
evaluation of an equilibrium model distribution function 
p(x) compatible with the information contained in the 
target structure factor. The method is based on the max- 
imum entropy principle and is realized inside a Monte 
Carlo simulation scheme. MC represents an effective tool 
to pursue this approach: the maximization of configura- 
tional entropy is produced by the MC random movements 
for the construction of the trial configurations (source of 
entropy) while the transition probability among neighbor 
states selects the configurations and acts as a source of 
information. At equilibrium these two mechanisms are 
in balance, the net amount of information loss is zero, 
and the system approaches a state of maximum entropy 
consistently with the given constraints. 

The main advantage of this kind of procedure is that 
the ME solution is sought in terms of a "real" physical 
system which possesses a true configuration space beyond 
the two-body pair function; so its equilibrium distribu- 
tion implicitly defines the correlation functions of any 
order. Inside this scheme, the implementation of the ME 
algorithm realizes the maximization of the whole con- 
figurational entropy and not only of the two-body con- 
tribution. This method provides the maximum entropy 
estimate of the complete equilibrium distribution of the 
model system and the ensemble of configurations built 
according to it can be used to compute the average value 
of any quantity of interest. 

Since, under this perspective, the transition probabil- 
ity is the natural object in which the knowledge on the 
system is codified, we seek this quantity with the aim of 
building a model distribution function that produces an 
expectation value of m consistent with the target refer- 
ence value fit (the proper definition of this parameter on 
the basis of the target S(k) will be discussed in the next 
section) . To achieve this task we make use of the method 
developed in [8, 16] and we maximize the log-likelihood 
function between the model pair function and the target 
reference value. This choice is based on statistical rea- 
sons: in the limit of a large number of configurations the 
average rh computed over the model ensemble converges 
to the expectation value p, of the model distribution func- 



tion and the log-likelihood can be related to the relative 
entropy D (Kullback-Leibler divergence [19]) between the 
model and the target distributions: 

In^(m) = -£>(^||^) (17) 

so the maximization of the log-likelihood function is 
asymptotically equivalent to the minimization of the rel- 
ative entropy (17). Given two distributions p and q, the 
relative entropy D(p\\q) is positive definite and vanishing 
only if p = q, so a complete maximization of the likeli- 
hood function implies the equality of the distributions. 

So our general strategy is the following: we perform a 
MC simulation using a transition probability which max- 
imizes the log-likelihood function defined above. This 
procedure builds a maximum entropy ensemble of con- 
figurations constrained by the target S(k) and the radial 
distribution function computed over this ensemble is the 
maximum entropy estimate of the inverse Fourier trans- 
form of the target structure factor. Since the maximum 
entropy principle has the feature of providing reliable es- 
timates on the basis of a partial input of information, we 
expect that this procedure should be able to produce a 
correct reconstruction of the radial distribution function 
starting from a limited knowledge of the structure factor. 

In the next section we will describe some details of the 
implementation of this procedure. The applications of 
the method and some checks of its reliability and sensi- 
tivity to the amount of missing information are discussed 
in Sec. III. 



D. Maximization of the log-likelihood function 

Assume that the St(k) has been measured with a reso- 
lution Sk up to the value kM', so the target input is given 
by N t = kM /Sk samplings of the structure factor. 

The first step consists in a proper definition of the 
model system (see Sec. II A): the value of the model 
density is chosen equal to the target one and the model 
pair function is sampled up to the maximum value rjy = 
ir /5k. This choice ensures that the model structure fac- 
tor is sampled with the same resolution as the one of the 
target system. The spatial resolution 6r in the model 
configuration space is chosen with the double task of pro- 
ducing an accurate sampling of the model RDF and to 
ensure that the maximum sampling value of the model 
structure factor (given by n/5r) is greater than the target 
value ku- 

We define the procedure for the construction of the 
model ensemble based on the maximization of the log- 
likelihood function described in the previous section. 
First we build the reference distribution on the basis of 
available information concerning the target structure fac- 
tor [see Eq. (8)]: 

iHj=fnf' ) + S t {k j )-l (18) 

where j extends over all the shells in the model system 
(from 1 to N = Tm/St) and we impose that St{kj) is 
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equal to 1 for all j > N t . Given the jit we compute 
its inverse Fourier transform /i t which represents the ex- 
pectation value of the target pair function. Due to the 
lacking information in the target structure factor the fi t 
provides a biased reconstruction of the true target pair 
function; typically this function exhibits nonphysical be- 
havior such as, for instance, strong oscillations inside the 
hard core radius. 

Next we analyze the construction of the log-likelihood 
ratio. Assume that we have performed s MC iterations. 
For each point of the path we compute a local sampling 
of the PF and its FT fS a ^ and we construct the 
global pair functions m and fh. Then we select a ref- 
erence particle and compute a local sampling of the PF 
n (1) ; at the same time the particle is randomly moved 
and the new local sampling of the PF is stored in the 
array n <2) . In this way we obtain two different samplings 
of fh at the level s + 1, namely fh {1) and m <2) . Then we 
perform a cut in the model system consistently with the 
missing information in the target reference function: so 
we substitute the perfect gas value in both the fh sam- 
plings for j > N t . This procedure provides rh cut{1> and 
fh cut (2) which are the natural quantities to be compared 
with jl t . Finally we define the log-likelihood ratio via the 
relation 



provides: 
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where m (bias> represents the inverse Fourier transform of 
m cut . We see that, once reformulated in the configura- 
tion space, the log-likelihood becomes diagonal: the zth 
term in the sum (21) is weighted by the ith shell value of 
the discrepancy between the model and the target global 
(biased) PF. This behavior is a consequence of the inde- 
pendence between the local pair functions related to dif- 
ferent shells [see Eq. (12)]. It is worth mentioning that 
Eq. (21) is strongly reminiscent of the log-likelihood ra- 
tio computed in [8, 16] starting from the knowledge of 
the target RDF. Indeed we recognize that the weighted 
difference between ra {hias) and \i t is the first order expan- 
sion [21] of ln(m (6it " ) //u t ). Furthermore, if the complete 
target structure factor is provided, then the input of in- 
formation content becomes equivalent to the knowledge 
of the target RDF; in this case nt represents the true 
value of the target PF and the rn (bias) coincides with the 
model global PF m thus providing an identical expression 
of the log-likelihood ratio. 
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The transition probability selects all the trial configura- 
tions which maximize the likelihood function (<5A < 0) 
[20] and consequently the distribution of the fh cut con- 
verges to a multivariate Gaussian defined by the target 
parameters fi t and /i t [see Eqs. (13) and (16)]. At the 
same time the model global sampling m converges to the 
unbiased reconstruction of the target RDF and its FT fh 
builds the complete target structure factor. So, thanks 
to the ME approach, we build a complete estimate of 
the target distribution function on the basis of a limited 
amount of information. 

We conclude this section by analyzing the expression 
of the log-likelihood ratio. In the limit of a large number 
of configurations we can expand Eq. (19) in power of s. 
The leading order contribution reads: 



SX 



N t 
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Equation (20) evaluates the log-likelihood ratio as the 
weighted sum of the differences between the actual and 
the trial local sampling n. The weights are proportional 
to the discrepancy between the global fh cut and the tar- 
get reference function and, due to the non diagonal cor- 
relation matrix, each term in the sum depends on the 
whole difference (fh cut — fit)- It is interesting to recast 
this equation in terms of the local PFs in the r space; to 
obtain this result we make use of Eqs. (10) and (16); this 



E. A remark on the transition probability 

In Sec. II D we stated that the transition probability is 
defined in a way to accept all the trial configurations with 
a log-likelihood ratio lower than zero (SX < 0). In or- 
der to better comprehend the reasons behind this choice 
it is useful to briefly recall the main results concerning 
the analysis of the transition probability described in [8] . 
Following the approach outlined in this reference we can 
interpret Eq. (21) has a proportional feedback controller 
which selects the model system configurations on the ba- 
sis of the "error" e = (m"" as) — Htj/^t between the target 
and the model global pair function. This interpretation 
suggests the formulation of an improved expression for 
SX, based on the theory feedback systems, that also in- 
clude an integral term apart from the proportional one; 
this latter quantity keeps into account all the errors in 
the steps preceding the actual one. In this way we realize 
a proportional-integral controller, schematically defined 
as: 



SX 



E 



kpCi 



where k v and kj are the coefficients of the proportional 
and integral term, respectively. The transition probabil- 
ity is defined as min{l, exp(— SX)} and the proportional 
and integral coefficients are fixed with the aim to ensure 
the correct fluctuation of the model PF around its av- 
erage value. Results reported in [8] evidence that this 
approach allows one to build the correct equilibrium dis- 
tribution of the target system; furthermore, the inter- 
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action potential emerges as the asymptotic limit of the 
integral term. 

The procedure delineated above can be applied in the 
present case and would allow one to directly extract the 
interaction potential from the knowledge of the structure 
factor. Instead, we have adopted a different implementa- 
tion of the feedback controller in which the integral term 
is absent and the proportional coefficient is virtually infi- 
nite: so only the trial configurations with a log-likclihood 
ratio lower than zero are accepted. The main advantage 
of this choice is that a pure proportional controller en- 
sures a straightforward and stable convergence of the in- 
version procedure, so the RDF is obtained without the 
need of setting any parameters. In this way we can per- 
form an intermediate check of the inversion procedure. 
Obviously, the extraction of the interaction potential re- 
quires the subsequent inversion of the RDF using the 
method described in [8]. 

III. APPLICATIONS 

The technique previously described has been applied 
to a simple Lennard- Jones fluid and to the experimen- 
tal structure factor data of the liquid sodium at 100°C 
measured by Greenfield, Wellendorf and Wiser in [22]. 

The inverse MC simulation is realized in the NVT en- 
semble: the model configuration space is a cubic volume 
of linear length L with N p pointlike particles and the pe- 
riodic boundary conditions together with the minimum 
image convention have been adopted. 

The transition probability between neighbor states has 
been evaluated by using Eq. (21) for the computation of 
the log-likelihood ratio. This formula requires knowledge 
of the HC radius which is a priori unknown; a brief esti- 
mate of its value can be obtained (as suggested by Reatto 
in [4] ) by computing the inverse FT of the structure fac- 
tor and by taking a fraction of the r position of its first 
peak. ME will allow this estimate to be corrected to its 
optimal value during the simulation. 

A. Results for the Lennard-Jones system 

We consider a system described by the Lennard-Jones 
potential with argon-like parameters a = 3.405 A and 
e/kB = 119.76 K. The target structure factor is evaluated 
by performing a Metropolis MC simulation on a system 
of 864 particle at the reduced density p* = per 3 = 0.84 
and reduced temperature T* = ksT/e = 0.75, near the 
triple point. The simulation run for 2 x 10 4 cycles after 
equilibration. The g(r) has been evaluated up to r* = 
r/a = 7.05 (24 A), the width of the shells for the measure 
of the g{ri) was 5r = 2.4 x 10~ 2 A, and the number of 
measured points was 10 3 . The structure factor has been 
evaluated using the procedure described in Sec. II A; the 
k resolution is given by Eq. (3) and is equal to 8k = 0.13 

A- 1 . 



Once the target S(k) was computed we performed the 
inversione procedure for the reconstruction of the RDF. 
In order to check the sensitivity of this approach we trun- 
cated the target S(k) at different values of k and we pro- 
ceeded to the reconstruction for each of the truncated 
function. So we built three structure factors, namely, 
Sti(k) (truncated at ku = 13 A -1 ), S t2 (k) (truncated at 
k.M = 6.5 A -1 ), and S t3 (k) (truncated at ku — 3.2 A" 1 ). 
Then the reconstruction procedure started for 2 x 10 4 cy- 
cles after equilibration. In this way we produced three 
radial distribution functions, namely, </i(r), g 2 (r), g 3 (r) 
and the corresponding structure factors -Si (A:), S 2 (k) and 
S 3 (k). 

Results are reported in Fig. 1. The first line contains 
the outcomes of the inversion starting from St i (k) . The 
maximum difference between the target and the model 
structure factor for k up to ku is about 4 x 10~ 4 and 
the procedure reconstructed the target S(k) for k > ku 
with an error lower than 1 x 10~ 3 ; the model RDF re- 
produces the target values with a maximum difference 
of about 2 x 10~ 2 . The second line reports results ob- 
tained using the information content of S t2 (k). Even in 
this case the maximum discrepancy up to ku (6.5 A -1 ) 
is about 4 x 10~ 4 and the procedure reconstructed the 
target structure for k > ku with an error lower than 
1 x 10~ 2 ; the maximum difference between the target 
and the model RDF is about 5 x 10~ 2 . Finally, in the 
last line of Fig. 1 we present the results of the inversion of 
St 3 (k). The maximum difference between the target and 
the model structure factor up to ku (3.2 A -1 ) is about 
4 x 10~ 4 as in the previous cases and, despite the mod- 
est information content of the target structure factor, the 
model S(k) reproduces the target one for k > ku with 
an error lower than 4 x 10~ 2 . The corresponding RDF 
reconstructs the target function with a maximum dis- 
crepancy of about 6 x 10~ 2 . This analysis evidences the 
effectiveness of the maximum entropy principle to pro- 
vide accurate reconstructions on the basis of a limited 
amount of information. 

In order to complete the inversion procedure we have 
evaluated the interaction potential associated to the 
three RDFs computed above. These results have been 
obtained by using the method described in [8] and are 
presented in Fig. 2. The analysis of this figure shows that 
the potentials extracted from g x {r) and g 2 (r) are essen- 
tially equivalent and provide a good estimate of the target 
one (with a maximum discrepancy of about 5 x 10~ 2 dis- 
tributed over the r axis). The potential extracted from 
g 3 (r) is less accurate with respect to the previous ones. In 
this case the main features of the target potential (such 
as the amplitude and location of the absolute minimum) 
are reproduced correctly but some spurious oscillations 
are present. This behavior is a consequence of the pres- 
ence of small oscillations in the RDF g 3 (r) which are 
hardly visible at the scale of Fig. 1. This fact indicates 
that a very precise reconstruction of the target RDF is 
needed in order to obtain a correct solution of the inverse 
problem. 
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(c) 

FIG. 1. (Color online) Results of the inversion procedure for a Lennard- Jones system. The left column contains the plots of 
the structure factor: continuous line for the target S(k) used as model input, dotted line for the target S(k) in the k region 
beyond kM, and filled circles for the model S(k). The right column contains the radial distribution functions: continuous line 
for the target RDF and filled circles for the model RDF. (a) Target S(k) truncated at 13 A -1 ; (b) target S(k) truncated at 6.5 
A -1 ; and (c) target S(k) truncated at 3.2 A -1 . 



B. Inversion of the Na data 



We present the result of the inversion of the structure 
factor of the liquid Na at 100 °C [22]. Since we are dealing 



with a real fluid at high density we expect that the many- 
body contributions in the interaction potential cannot 
be neglected, so our solution of the inverse problem will 
produce an effective pair potential. 

Experimental data have been measured with a vari- 
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r* 

FIG. 2. (Color online) Results for the Lennard- Jones system. 
The target potential (continuous line) and the model potential 
[circles for gi(r), squares for (^(f) and triangles for g 3 (r)] are 
plotted. 

able k resolution up to 8.9 A -1 . In order to apply the 
inversion technique defined in Sec. II we need a target 
S(k) sampled uniformly with a 5k value compatible with 
the size of the simulation box; so a preliminary opera- 
tion on experimental data is needed. Our prescription 
is the following: we perform the inverse Fourier trans- 
form of the experimental S(k) and compute the (biased) 
RDF; then we "cut" this function at a value r M consis- 
tent with the linear dimension of the model system in 
which wc will perform the inversion procedure. Finally, 
we transform back to the k space and compute the new 
structure factor which is ready to be used as the target 
input function. The reliability of this method has been 
tested for a Lennard- Jones system in which the evalua- 
tion of the target S(k) and the inversion procedure for 
the reconstruction of the RDF have been performed in 
boxes of different linear length. In all the cases we have 
obtained a correct reconstruction of the target RDF. For 
the present case of the Na data we have chosen r M = 22 
A which corresponds to the maximum sampled value for 
a model system made of 864 particles. 

The inverse simulation procedure for the reconstruc- 
tion of the Na radial distribution function took 2 x 10 4 
cycles after equilibration. The result for the RDF is re- 
ported in the left panel of Fig. 3. This function evidences 
a HC radius of 2.65 A and the first peak is located at 
r = 3.72 A and is equal to 2.32. It is interesting to 
compare our result with the RDF obtained in [4] using 
an iterative method for the inversion of the Na structure 
factor. The two functions are in substantial agreement: 
the RDF of [4] has a HC radius of 2.7 A, whereas the first 
peak is located at r = 3.66 A and is equal to 2.43; fur- 



thermore even the relative positions of the other minima 
and maxima differ less than the 2%. 

The Na pair effective potential has been extracted from 
the RDF computed above by using the method described 
in [8]. The result is presented in the right panel of Fig. 
3. The potential reported in the figure evidences a highly 
repulsive part in the low r region; then there is an attrac- 
tive zone with the minimum located in r = 4.05 A and 
equal to —0.91 and a further weak repulsive part with a 
local maximum at r = 5.45 A. Finally, the potential ap- 
proaches zero with some smooth oscillations. Again we 
compare our solution with the one obtained in [4]. We 
observe that the shapes of the two potentials are in qual- 
itative agreement but a quantitative comparison reveals 
some differences: in particular, the locations of the abso- 
lute minima coincide but the depth of the potential wells 
differ by about 15%. This fact has to be interpreted on 
the basis of the high sensitivity of the inverse problem 
on the input RDF, so minor differences among the RDFs 
could produce a sensible discrepancy at the level of the 
interaction potentials. 



IV. DISCUSSION AND CONCLUSIONS 

We have presented a method, based on the maximum 
entropy principle of information theory, for the recon- 
struction of the radial distribution function of an equi- 
librium statistical system starting from the partial knowl- 
edge of its structure factor. The procedure is realized in- 
side a Monte Carlo simulation scheme which is revealed 
to be an effective tool for the implementation of the ME; 
indeed the maximization of the configuration entropy is 
realized by the MC random displacements whereas the 
transition probability between neighbor states is defined 
consistently with the information input codified in the 
target S(k). Once the RDF has been computed we 
can derive the two-body effective potential by using the 
method defined in [8], thus providing a true ME-based 
solution of the inverse problem. 

As stated in Sec. II C, the realization of the ME ap- 
proach inside a MC-based procedure presents some inter- 
esting features. Indeed, this method realizes a complete 
maximization of the model system configurational en- 
tropy (beyond the two-body term) and provides the max- 
imum entropy estimate of the complete equilibrium dis- 
tribution of the model system. So within this approach it 
is possible to extract informations concerning the physi- 
cal system under inspection that goes beyond the simple 
improvement of the Fourier transform of the structure 
factor. Furthermore, since the correlators are obtained 
through the ensemble average over the model system con- 
figuration space, any non physical feature (such as, for 
instance, negative values for the RDF inside the hard 
core region) is automatically avoided. 

The applications of the method are presented in Sec. 
III. Results analyzed in the first part of this section are 
designed to test our approach with respect to the missing 
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FIG. 3. Results of the inversion procedure for Na at 100° C. The left panel contains the plot of the radial distribution function. 
The right panel contains the two-body effective potential. 



information in the input structure factor. ME has the 
feature of being "maximally noncommittal with regard 
to the missing information" [13], and indeed, the results 
discussed in Sec. Ill A demonstrate a reliable reconstruc- 
tion of the system RDF even for very limited knowledge 
of the S(k). Finally, Sec. IIIB contains the analysis of 
the real experimental data of the liquid sodium at 100 
°C. We evaluated the Na RDF and then we extracted 
the effective pair interaction potential; both of the pro- 
cedures converged to a stable result. The solution of 
the inverse problem for this system has been compared 
with the one presented in [4] . The discrepancies between 
the two potentials have been motivated in terms of the 
(small) differences among the RDFs. It is well known 
that the solution of the inverse problem is highly sensi- 
ble to the details of the pair function used as the input 
of the reconstruction procedure. So, under this perspec- 
tive, the adoption of the maximum entropy principle as 
a general and solid guideline for the definition inversion 
procedure could guarantee the correctness of the results. 

A last comment concerns the possible extensions of the 
technique described in the present paper. ME principle 
holds for any system at equilibrium, so the main idea at 
the basis of this approach can be extended to systems 
other than the simple monoatomic fluid discussed in the 



present paper. For instance, polyatomic fluids are often 
characterized by strong directional interactions and an ef- 
fective description of their physical properties in terms of 
the model system defined in this paper could be revealed 
as very crude. In these cases, however, it is possible to 
define an improved model system with new degrees of 
freedom which provide a better match with the ones of 
the experimental system under inspection. The statisti- 
cal analysis presented in Sec. II has to be extended in 
order to include these new degrees of freedom and the 
same kind of procedure based on the maximization of 
the log-likelihood ratio can be performed. Obviously, the 
feasibility of this strategy requires a higher involvement 
of information concerning the target system and further 
experimental data, beyond the two-body pair function 
has to be provided. 
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